Encoding Dissimilarity Data for Statistical Model Building.
نویسنده
چکیده
We summarize, review and comment upon three papers which discuss the use of discrete, noisy, incomplete, scattered pairwise dissimilarity data in statistical model building. Convex cone optimization codes are used to embed the objects into a Euclidean space which respects the dissimilarity information while controlling the dimension of the space. A "newbie" algorithm is provided for embedding new objects into this space. This allows the dissimilarity information to be incorporated into a Smoothing Spline ANOVA penalized likelihood model, a Support Vector Machine, or any model that will admit Reproducing Kernel Hilbert Space components, for nonparametric regression, supervised learning, or semi-supervised learning. Future work and open questions are discussed. The papers are: F. Lu, S. Keles, S. Wright and G. Wahba 2005. A framework for kernel regularization with application to protein clustering. Proceedings of the National Academy of Sciences 102, 12332-1233.G. Corrada Bravo, G. Wahba, K. Lee, B. Klein, R. Klein and S. Iyengar 2009. Examining the relative influence of familial, genetic and environmental covariate information in flexible risk models. Proceedings of the National Academy of Sciences 106, 8128-8133F. Lu, Y. Lin and G. Wahba. Robust manifold unfolding with kernel regularization. TR 1008, Department of Statistics, University of Wisconsin-Madison.
منابع مشابه
Dissimilarity Data in Statistical Model Building and Machine Learning
We explore three papers concerned with two methods for incorporating discrete, noisy, incomplete dissimilarity data into statistical/machine learning models for supervised, semisupervised or unsupervised machine learning. The two methods are RKE (Regularized Kernel Estimation), and RMU (Regularized Manifold Unfolding). Briefly put, the methods use dissimilarity information between objects in a ...
متن کاملClustering, Coding, and the Concept of Similarity
This paper develops a theory of clustering and coding which combines a geometric model with a probabilistic model in a principled way. The geometric model is a Riemannian manifold with a Riemannian metric, gij(x), which we interpret as a measure of dissimilarity. The probabilistic model consists of a stochastic process with an invariant probability measure which matches the density of the sampl...
متن کاملRepresentational models: A common framework for understanding encoding, pattern-component, and representational-similarity analysis
Representational models specify how activity patterns in populations of neurons (or, more generally, in multivariate brain-activity measurements) relate to sensory stimuli, motor responses, or cognitive processes. In an experimental context, representational models can be defined as hypotheses about the distribution of activity profiles across experimental conditions. Currently, three different...
متن کاملReceptive Field Encoding Model for Dynamic Natural Vision
Introduction: Encoding models are used to predict human brain activity in response to sensory stimuli. The purpose of these models is to explain how sensory information represent in the brain. Convolutional neural networks trained by images are capable of encoding magnetic resonance imaging data of humans viewing natural images. Considering the hemodynamic response function, these networks are ...
متن کاملA New Cluster Isolation Criterion Based on Dissimilarity Increments
This paper addresses the problem of cluster defining criteria by proposing a model-based characterization of interpattern relationships. Taking a dissimilarity matrix between patterns as the basic measure for extracting group structure, dissimilarity increments between neighboring patterns within a cluster are analyzed. Empirical evidence suggests modeling the statistical distribution of these ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Journal of statistical planning and inference
دوره 140 12 شماره
صفحات -
تاریخ انتشار 2010